ARMA / ARIMA

Week 6

Evie Zhang

Old Dominion University

Exam 1

  • Estimating Models (intercept, trend, kink, AR, MA)
  • Visualization
  • Interpretation / Inference

Topics

  • Stationarity
  • Autocorrelation
  • Moving Averages
  • ARMA, ARIMA
  • ARCH, GARCH

Data Generating Process

\[Y = f(X, \epsilon)\]

If we want to model some Data Generating Process (DGP), the data we use to train/estimate the model must “represent” the data we will see in the future. In other words, we need the future to resemble the past in some systematic way.

  • Therefore, we need the DGP to be independent of time (stationarity)
  • DGPs with trends or seasonal components are not stationary.
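To see what this looks like, here is a minimal simulation (my own sketch, not from the slides) contrasting a stationary AR(1), which keeps reverting to its mean, with a random walk, whose distribution depends on \(t\):

```r
# Stationary AR(1): Y_t = 0.5 * Y_{t-1} + e_t (mean-reverting)
# Random walk:      Y_t = Y_{t-1} + e_t       (non-stationary, unit root)
set.seed(1)
n <- 500
e <- rnorm(n)
ar1 <- stats::filter(e, filter = 0.5, method = "recursive")
rw  <- cumsum(e)
par(mfrow = c(1, 2))
plot.ts(ar1, main = "Stationary AR(1)")
plot.ts(rw,  main = "Random walk (unit root)")
```

The AR(1) hovers around zero; the random walk wanders arbitrarily far because every shock is permanent.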

Diagnosing (Non-)Stationarity

To test for stationarity, we have a few options.

  • Visual Inspection

  • ACF, PACF

  • Unit Root test (augmented Dickey-Fuller, KPSS, etc.)

Diagnosing Stationarity

ACF, PACF

  • ACF: Correlation between \(Y_t\) and \(Y_{t-i}\)
    • Slowly decaying correlations in the ACF signal a non-stationary process.
    • “Seasonal patterns” signal a non-stationary process.
  • PACF: Correlation between \(Y_t\) and \(Y_{t-i}\) after controlling for \(Y_{t-1}, \ ... \ Y_{t-i+1}\)
  • There is a degree of “artistry” that comes with reading ACF, PACF plots.

Diagnosing Stationarity

Unit Root test (augmented Dickey-Fuller)

  • \(Y_{t} = \beta Y_{t-1} + e_t\)

  • \(Y_{t} - Y_{t-1} = \beta Y_{t-1} + e_t - Y_{t-1}\)

  • \(\Delta Y_{t} = (\beta-1)Y_{t-1} + e_t\)

  • Test whether \((\beta - 1)\) is less than \(0\) (the test is one-sided).

\((\beta - 1) < 0 \implies \beta < 1 \implies\) no unit root (i.e. stationarity)
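The regression above can be sketched by hand. Below is a simplified (non-augmented, no trend) Dickey-Fuller regression on a simulated random walk; note that the resulting t statistic does not follow the usual t distribution under the null, which is why tseries::adf.test() compares it to tabulated Dickey-Fuller critical values instead:

```r
# Regress diff(Y) on lagged Y; the slope estimates (beta - 1),
# which should be near 0 for a unit-root process.
set.seed(2)
y    <- cumsum(rnorm(300))  # simulated random walk
dy   <- diff(y)
ylag <- y[-length(y)]
df_reg <- lm(dy ~ ylag)
summary(df_reg)$coefficients["ylag", ]
```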

Achieving Stationarity

Okay, our time series is non-stationary. What now?

  • Difference \(Y_t\) and \(Y_{t-1}\). Most times, a single difference is enough; sometimes we need \(d\) differences. The differenced series is interpreted as changes in the outcome variable, instead of levels.
  • Take the log of a variable.
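A quick illustration (mine, not the deck's) of why differencing works: the lag-1 autocorrelation of a random walk is near 1, while that of its first difference is near 0, since differencing recovers the stationary shocks.

```r
# Lag-1 autocorrelation in levels vs. first differences of a random walk.
set.seed(3)
rw <- cumsum(rnorm(200))  # non-stationary levels
d  <- diff(rw)            # stationary shocks e_t
c(acf_levels = acf(rw, plot = FALSE)$acf[2],  # near 1
  acf_diff   = acf(d,  plot = FALSE)$acf[2]) # near 0
```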

Achieving Stationarity

Code
data("AirPassengers")
air <- AirPassengers
plot(air, ylab = "Passengers")

Achieving Stationarity

Code
acf(air, lag.max = 36)

Achieving Stationarity

Code
tseries::adf.test(air)

    Augmented Dickey-Fuller Test

data:  air
Dickey-Fuller = -7.3186, Lag order = 5, p-value = 0.01
alternative hypothesis: stationary

Achieving Stationarity

Code
plot(log(air), ylab = "log(Passengers)")

Achieving Stationarity

Code
plot(diff(log(air)), ylab = expression(paste(Delta, " log(Passengers)")))

Achieving Stationarity

Code
acf(diff(log(air)), lag.max = 36)

Achieving Stationarity

Code
tseries::adf.test(diff(log(air)))

    Augmented Dickey-Fuller Test

data:  diff(log(air))
Dickey-Fuller = -6.4313, Lag order = 5, p-value = 0.01
alternative hypothesis: stationary

Achieving Stationarity

We are still having trouble achieving stationarity with this time series due to its seasonal component. If that component were (magically / mathematically) removed, though, the mean and variance would look roughly constant.
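As a rough sketch of that "magical / mathematical" removal (my own illustration; stats::decompose() or seasonal ARIMA terms are the more standard tools), we can subtract each calendar month's mean from the logged series and then difference away the trend:

```r
data("AirPassengers")
lair <- as.numeric(log(AirPassengers))
# Subtract each month's average (the seasonal component), then difference
# the remainder to remove the trend.
seas <- ave(lair, cycle(AirPassengers))  # monthly means, one per observation
adj  <- diff(lair - seas)
plot(adj, type = "l", ylab = "deseasonalized, differenced log(Passengers)")
```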

Let’s try another time series: Monthly Liquor Sales from FRED.

Achieving Stationarity

Code
library(lubridate)  # provides mdy(), ymd(), interval(), months()
liquor <- read.csv("../data/liquor.csv")
liquor$date <- mdy(liquor$date)
liquor <- liquor[liquor$date < ymd("2020-01-01"),]
plot(liquor$date, liquor$sales, type = "l",
     xlab = "Month", ylab = "Sales in Millions")

Achieving Stationarity

Code
plot(liquor$date[2:nrow(liquor)],
     diff(liquor$sales), type = "l",
     xlab = "Month", ylab = "Difference")

Achieving Stationarity

Code
plot(liquor$date, liquor$sales_adj, type = "l",
     xlab = "Month", ylab = "Sales in Millions")

Achieving Stationarity

Code
plot(liquor$date[2:nrow(liquor)],
     diff(liquor$sales_adj), type = "l",
     xlab = "Month", ylab = "Difference")

Achieving Stationarity

Code
plot(liquor$date, asinh(liquor$sales_adj), type = "l",
     xlab = "Month", ylab = "asinh(Sales in Millions)")

Achieving Stationarity

Code
plot(liquor$date[2:nrow(liquor)],
     diff(asinh(liquor$sales_adj)), type = "l",
     xlab = "Month", ylab = "Difference of Log")

Diagnosing Stationarity

Log Seasonally Adjusted Sales Data (Difference)

Code
par(mfrow = c(1, 2))
acf(diff(asinh(liquor$sales_adj)), xlim = c(1, 25), main = "log(Adjusted Sales)")
pacf(diff(asinh(liquor$sales_adj)), xlim = c(1, 25), main = "log(Adjusted Sales)")

\(\text{log}()\text{;} \ \text{sinh}^{-1}()\)

\(Y = \alpha + \beta X + e\)

  • a one unit change in X causes a \(\beta\) unit change in Y.

\(Y = \alpha + \beta \text{log}(X) + e\)

  • a one percent change in X causes a \(\beta/100\) unit change in Y.

\(\text{log}(Y) = \alpha + \beta X + e\)

  • a one unit change in X causes a \(100*\beta\) percent change in Y.

\(\text{log}(Y) = \alpha + \beta \text{log}(X) + e\)

  • a one percent change in X causes a \(\beta\) percent change in Y.
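These rules of thumb rest on the approximation \(\text{log}(1+x) \approx x\) for small \(x\). A quick numeric check of the log-log case with an assumed elasticity:

```r
# log(Y) = 1 + b*log(X): a 1% increase in X should change Y by about b percent.
b    <- 0.5
yfun <- function(x) exp(1 + b * log(x))
x0   <- 100
pct  <- 100 * (yfun(1.01 * x0) - yfun(x0)) / yfun(x0)
pct  # roughly b percent (about 0.5), not 100*b percent
```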

Example Autoregressive (AR) Model

We want to forecast liquor sales. However, this time series in levels is non-stationary. Instead, we’ll estimate growth rates in liquor sales and then transform back into sales. Throughout the process, visualize your transformation and test if it is stationary.

  • Step 1: Log the time series to think about percentages.
  • Step 2: Take the first difference to think about changes.
  • Step 3: Look at the ACF/PACF to identify lag structure.

Example Autoregressive (AR) Model

\[ \begin{align} log(Y_{t}) - log(Y_{t-1}) &= \Delta log(Y_{t}) \\ &= \alpha + \beta \Delta log(Y_{t-1}) + \epsilon_t \end{align} \]

If \(Y_{t-1}\) increased by 1% from \(Y_{t-2}\), we can expect (on top of \(\alpha\)) a \(\beta\)% change from \(Y_{t-1}\) to \(Y_{t}\).
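As a sanity check (simulated data with an assumed \(\beta = 0.4\), not the liquor series), we can generate growth rates from an AR(1) and confirm arima() recovers the coefficient:

```r
# Growth rates that follow an AR(1) with coefficient 0.4 and mean ~0.003.
set.seed(4)
g   <- 0.003 + 0.01 * arima.sim(list(ar = 0.4), n = 1000)
fit <- arima(g, order = c(1, 0, 0), include.mean = TRUE)
round(coef(fit)[c("ar1", "intercept")], 3)  # ar1 near 0.4
```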

Example Autoregressive (AR) Model

Before adding AR terms, let’s fit and interpret a simple trend model.

Code
liquor$t <- interval(min(liquor$date), liquor$date) %/% months(1)
summary(reg_liq <- lm(log(sales_adj) ~ t, data = liquor))

Call:
lm(formula = log(sales_adj) ~ t, data = liquor)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.062226 -0.015790 -0.004704  0.016031  0.072428 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 7.4254425  0.0028650  2591.8   <2e-16 ***
t           0.0031948  0.0000148   215.8   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.02632 on 334 degrees of freedom
Multiple R-squared:  0.9929,    Adjusted R-squared:  0.9929 
F-statistic: 4.659e+04 on 1 and 334 DF,  p-value: < 2.2e-16

Example Autoregressive (AR) Model

Plot the model’s fitted values.

Code
plot(liquor$date, log(liquor$sales_adj), type = "l",
     ylab = "log(Sales)", xlab = "Month")
lines(liquor$date, reg_liq$fitted.values, col = "tomato")

Example Autoregressive (AR) Model

Let’s find the average month-over-month growth of this variable. Then, let’s compare this to the regression coefficient.

Code
t <- 2:nrow(liquor)
pc <- (liquor$sales_adj[t] - liquor$sales_adj[t-1]) / liquor$sales_adj[t-1]

paste0("Regression Coefficient: ", round(reg_liq$coefficients[2], 5), "; Raw average: ", round(mean(pc), 5))
[1] "Regression Coefficient: 0.00319; Raw average: 0.00321"

This suggests an average 0.3% increase per month.

Example Autoregressive (AR) Model

Now we need to figure out how to model this process. We can use the ACF and PACF to guide our thinking.

Code
par(mfrow = c(1, 2))
acf(diff(asinh(liquor$sales_adj)), xlim = c(1, 25), main = "log(Adjusted Sales)")
pacf(diff(asinh(liquor$sales_adj)), xlim = c(1, 25), main = "log(Adjusted Sales)")

Example Autoregressive (AR) Model

Estimate an AR(1) and AR(12) model.

Code
r1 <- arima(diff(asinh(liquor$sales_adj)), c(1, 0, 0), include.mean = T)
r2 <- arima(diff(asinh(liquor$sales_adj)), c(12, 0, 0), include.mean = T)

stargazer(r1, r2, type = "html")
                    Dependent variable: diff
                      (1)          (2)
----------------------------------------------
ar1                 -0.388***    -0.448***
                    (0.051)      (0.054)
ar2                              -0.147**
                                 (0.059)
ar3                              -0.047
                                 (0.060)
ar4                               0.059
                                 (0.059)
ar5                               0.004
                                 (0.059)
ar6                              -0.008
                                 (0.060)
ar7                               0.003
                                 (0.060)
ar8                              -0.049
                                 (0.060)
ar9                              -0.084
                                 (0.060)
ar10                             -0.094
                                 (0.060)
ar11                             -0.113*
                                 (0.059)
ar12                             -0.147***
                                 (0.054)
intercept            0.003***     0.003***
                    (0.0005)     (0.0003)
----------------------------------------------
Observations         335          335
Log Likelihood       1,016.674    1,025.347
sigma2               0.0001       0.0001
Akaike Inf. Crit.   -2,027.348   -2,022.694
----------------------------------------------
Note: *p<0.1; **p<0.05; ***p<0.01

Example Autoregressive (AR) Model

Code
liquor2 <- data.frame(date = max(liquor$date) + months(1:24),
                      sales = NA, sales_adj = NA,
                      t = max(liquor$t) + 1:24)
liquor <- rbind(liquor, liquor2)
p1 <- predict(r1, n.ahead = 24)  # predict.Arima returns $pred and $se
p2 <- predict(r2, n.ahead = 24)

liquor$p1 <- liquor$p2 <- liquor$sales_adj
liquor$p2_u <- liquor$p2_l <- NA
w <- which(is.na(liquor$sales_adj))
for(i in 1:24){
  
  liquor$p1[w[i]] <- liquor$p1[w[i] - 1]*(1 + sinh(p1$pred[i]))
  liquor$p2[w[i]] <- liquor$p2[w[i] - 1]*(1 + sinh(p2$pred[i]))
  liquor$p2_u[w[i]] <- liquor$p2[w[i] - 1]*(1 + sinh(p2$pred[i]) + 1.645*sinh(p2$se[i]))
  liquor$p2_l[w[i]] <- liquor$p2[w[i] - 1]*(1 + sinh(p2$pred[i]) - 1.645*sinh(p2$se[i]))
}

lim <- liquor$date >= ymd("2015-01-01")
plot(liquor$date[lim], liquor$p1[lim], type = "l", col = scales::alpha("tomato", .6), lwd = 3)
lines(liquor$date[lim], liquor$p2[lim], col = scales::alpha("dodgerblue", .6), lwd = 3)
lines(liquor$date[lim], liquor$sales_adj[lim], type = "l", lwd = 3)
lines(liquor$date[lim], liquor$p2_l[lim], type = "l", col = "dodgerblue")
lines(liquor$date[lim], liquor$p2_u[lim], type = "l", col = "dodgerblue")

Example Moving Average (MA) Model

Suppose you are a car manufacturer. Each month, you manufacture \(m\) vehicles with some error \(e_t\). \(e_t\) can be negative if some employees are on vacation; \(e_t\) can be positive if an employee brings coffee for everyone one day.

\[\text{V}_t = m + e_t\]

Each time period, some fraction (\(1-\theta\)) of vehicles are sold, and some are left over (\(\theta\)).

Therefore, your inventory for every period will look like:

\[\begin{align} \text{I}_t = V_t + \theta V_{t-1} &= (m + e_t) + \theta (m + e_{t-1}) \\ &= \mu + e_t + \theta e_{t-1} \end{align}\]

where \(\mu = (1 + \theta)m\). This is exactly an MA(1) process.
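The inventory story can be simulated directly (a sketch with made-up \(m\) and \(\theta\)). An MA(1) has autocorrelation \(\theta / (1 + \theta^2)\) at lag 1 and roughly zero at every longer lag, which is the cutoff pattern we look for in the ACF:

```r
# I_t = mu + e_t + theta * e_{t-1}, with mu = (1 + theta) * m.
set.seed(5)
n <- 1000; theta <- 0.6; m <- 100
e <- rnorm(n + 1)
I <- m * (1 + theta) + e[-1] + theta * e[-(n + 1)]
# Lag-1 autocorrelation should be roughly theta / (1 + theta^2) = 0.44;
# lags 2 and beyond should be near zero.
round(acf(I, plot = FALSE)$acf[2:4], 2)
```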

Example Moving Average (MA) Model

Maybe we can model alcohol sales as a moving average process instead of an autoregressive one.

Code
m1 <- arima(diff(asinh(liquor$sales_adj)), c(0, 0, 1), include.mean = T)
m2 <- arima(diff(asinh(liquor$sales_adj)), c(0, 0, 12), include.mean = T)

stargazer(m1, m2, type = "html")
                    Dependent variable: diff
                      (1)          (2)
----------------------------------------------
ma1                 -0.409***    -0.466***
                    (0.047)      (0.056)
ma2                               0.053
                                 (0.061)
ma3                               0.0003
                                 (0.060)
ma4                               0.109*
                                 (0.060)
ma5                              -0.008
                                 (0.061)
ma6                               0.070
                                 (0.064)
ma7                               0.025
                                 (0.064)
ma8                              -0.023
                                 (0.064)
ma9                              -0.026
                                 (0.061)
ma10                             -0.025
                                 (0.057)
ma11                             -0.128**
                                 (0.060)
ma12                             -0.186***
                                 (0.064)
intercept            0.003***     0.003***
                    (0.0004)     (0.0003)
----------------------------------------------
Observations         335          335
Log Likelihood       1,018.479    1,030.445
sigma2               0.0001       0.0001
Akaike Inf. Crit.   -2,030.958   -2,032.890
----------------------------------------------
Note: *p<0.1; **p<0.05; ***p<0.01

Example Moving Average (MA) Model

Code
pm1 <- predict(m1, n.ahead = 24)  # predict.Arima returns $pred and $se
pm2 <- predict(m2, n.ahead = 24)

liquor$pm1 <- liquor$pm2 <- liquor$sales_adj
w <- which(is.na(liquor$sales_adj))
for(i in 1:24){
  
  liquor$pm1[w[i]] <- liquor$pm1[w[i] - 1]*(1 + sinh(pm1$pred[i]))
  liquor$pm2[w[i]] <- liquor$pm2[w[i] - 1]*(1 + sinh(pm2$pred[i]))
  # liquor$pm2_u[w[i]] <- liquor$pm2[w[i] - 1]*(1 + sinh(pm2$pred[i]) + 1.645*sinh(pm2$se[i]))
  # liquor$pm2_l[w[i]] <- liquor$pm2[w[i] - 1]*(1 + sinh(pm2$pred[i]) - 1.645*sinh(pm2$se[i]))
}

lim <- liquor$date >= ymd("2015-01-01")
plot(liquor$date[lim], liquor$pm1[lim], type = "l", col = scales::alpha("tomato", .6), lwd = 3)
lines(liquor$date[lim], liquor$pm2[lim], col = scales::alpha("dodgerblue", .6), lwd = 3)
lines(liquor$date[lim], liquor$sales_adj[lim], type = "l", lwd = 3)

ARMA and ARIMA

ARMA(p, q) - a model combining AR(p) and MA(q) components.

ARIMA(p, d, q) - an ARMA(p, q) model fit to a series that has been differenced \(d\) times.

\[\begin{align} Y_{t} = \ &\alpha + \beta_1 Y_{t-1} + \ ...\ + \beta_p Y_{t-p} + e_{t} \\ & + \theta_1 e_{t-1} + \ ... \ + \theta_{q}e_{t-q} \end{align}\]
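A quick check (simulated data with assumed parameters, not from the slides) that arima() recovers the coefficients of a known ARMA(1, 1):

```r
# Simulate ARMA(1,1) with ar = 0.6, ma = 0.3, then re-estimate.
set.seed(6)
y   <- arima.sim(list(ar = 0.6, ma = 0.3), n = 2000)
fit <- arima(y, order = c(1, 0, 1))
round(coef(fit), 2)  # ar1 near 0.6, ma1 near 0.3
```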

Identifying Lag Structures

  • AR(p): ACF tails off gradually; PACF cuts off after lag \(p\).
  • MA(q): ACF cuts off after lag \(q\); PACF tails off gradually.
  • ARMA(p, q): both the ACF and PACF tail off gradually.

Estimating ARIMA Models

The forecast package has some nice functions for estimating ARIMA models.

  • Arima() works very similarly to arima() but is slightly more flexible. We will move toward this function.

  • auto.arima() uses information criteria to select the “best” (most likely) model. Check out this resource if you are interested.

Estimating ARIMA Models

Let’s use auto.arima() to help us estimate liquor sales.

Code
reg <- auto.arima(asinh(liquor$sales_adj[!is.na(liquor$sales_adj)]), seasonal = FALSE)
reg
Series: asinh(liquor$sales_adj[!is.na(liquor$sales_adj)]) 
ARIMA(1,1,1) with drift 

Coefficients:
          ar1      ma1   drift
      -0.1448  -0.2917  0.0031
s.e.   0.1220   0.1163  0.0004

sigma^2 = 0.0001347:  log likelihood = 1019.16
AIC=-2030.31   AICc=-2030.19   BIC=-2015.06

Estimating ARIMA Models

Code
f <- forecast(reg, h=24)
plot(liquor$date, liquor$sales_adj, type = "l", lwd = 2,
     ylim = range(sinh(f$upper[,2]), sinh(f$lower[,2]), liquor$sales_adj[year(liquor$date) >= 2015], na.rm = TRUE),
     xlim = ymd(c("2015-01-01", "2022-12-01")))
lines(liquor$date[is.na(liquor$sales_adj)], sinh(f$mean),
      col = "tomato", lwd = 2)
lines(liquor$date[is.na(liquor$sales_adj)], sinh(f$upper[,2]),
      col = "tomato", lwd = 2, lty = 2)
lines(liquor$date[is.na(liquor$sales_adj)], sinh(f$lower[,2]),
      col = "tomato", lwd = 2, lty = 2)

ARCH, GARCH

(Generalized) Autoregressive Conditional Heteroskedasticity

Take stocks for example:

  • It is extremely difficult to model/forecast returns (efficient markets hypothesis vs. behavioral finance)

  • Stock prices usually follow a random walk with a drift/trend, so returns are close to unpredictable

  • Modeling a stock’s volatility, however, has some merit.
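A minimal hand-rolled ARCH(1) simulation (my own sketch with assumed parameters; in practice you would use a package such as rugarch or fGarch) shows the signature pattern: the returns themselves are serially uncorrelated, but their squares are not, which is volatility clustering.

```r
# ARCH(1): today's variance depends on yesterday's squared shock.
set.seed(7)
n  <- 2000; a0 <- 0.1; a1 <- 0.5
r  <- numeric(n); sig2 <- numeric(n)
sig2[1] <- a0 / (1 - a1)               # unconditional variance
r[1]    <- sqrt(sig2[1]) * rnorm(1)
for (t in 2:n) {
  sig2[t] <- a0 + a1 * r[t - 1]^2
  r[t]    <- sqrt(sig2[t]) * rnorm(1)
}
c(acf_returns = acf(r,   plot = FALSE)$acf[2],  # near 0
  acf_squared = acf(r^2, plot = FALSE)$acf[2])  # clearly positive
```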

ARCH, GARCH

Daily Google Stock Returns

Code
goog <- read.csv("../data/GOOG.csv")
goog$date <- ymd(goog$Date)
goog <- goog[order(goog$date),]
goog$ret <- NA
goog$ret[2:nrow(goog)] <- diff(goog$Close)  # price changes; divide by lagged Close for percent returns

plot(goog$date, goog$ret, type = "l")

ARCH, GARCH

Code
par(mfrow = c(1, 2))
lim <- !is.na(goog$ret)
acf(goog$ret[lim]); pacf(goog$ret[lim])

ARCH, GARCH

Here are some resources for further reading:

Next Class

  • Seasonality
  • Standard Errors and Inference
  • Distributed Lags
  • Coefficient Dynamics